Ensemble Methods and Partial Least Squares Regression
Abstract
Recently, there has been increased attention in the literature to the use of ensemble methods in multivariate regression and classification. These methods have been shown to have interesting properties for both tasks. In particular, they can improve the accuracy of unstable predictors. Ensemble methods have so far been little studied in situations that are common for calibration and prediction in chemistry, i.e., situations with a large number of collinear x-variables and few samples. These situations are often approached with data compression methods such as principal components regression (PCR) or partial least squares regression (PLSR). The present paper investigates the properties of different types of ensemble methods used with PLSR in situations with highly collinear x-data. Bagging and data augmentation by simulated noise are studied. The focus is on the robustness of the calibrations. Both real and simulated data are used. The results show that ensembles trained on data with added noise can make PLSR robust against the type of noise added. In particular, the effects of sample temperature variations can be eliminated. Bagging does not seem to give any improvement over PLSR for small and intermediate numbers of components. It is, however, less sensitive to overfitting.
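The two ensemble strategies named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the NIPALS-style PLS1 fit, the function names, the ensemble size of 25, and the `noise_fn` callback are all assumptions made for illustration, and the abstract does not say how the noise-trained ensembles were actually built. Bagging here fits each member on a bootstrap resample of the calibration samples, while noise augmentation fits each member on the spectra plus a fresh draw of simulated noise. In both cases the member predictions are averaged.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit single-response PLS (NIPALS); returns (x_mean, y_mean, coefficients)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr                      # weight vector: covariance direction
        norm = np.linalg.norm(w)
        if norm < 1e-12:                   # nothing left to explain
            break
        w /= norm
        t = Xr @ w                         # scores
        tt = t @ t
        p = Xr.T @ t / tt                  # x-loadings
        qa = yr @ t / tt                   # y-loading
        Xr = Xr - np.outer(t, p)           # deflate X
        yr = yr - qa * t                   # deflate y
        W.append(w); P.append(p); q.append(qa)
    if not W:                              # degenerate data: predict the mean
        return x_mean, y_mean, np.zeros(X.shape[1])
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    b = W @ np.linalg.solve(P.T @ W, q)    # regression vector in x-space
    return x_mean, y_mean, b

def pls1_predict(model, X):
    x_mean, y_mean, b = model
    return (X - x_mean) @ b + y_mean

def bagged_pls_fit(X, y, n_components, n_models=25, rng=None):
    """Bagging: each member is fitted on a bootstrap resample of the samples."""
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    models = []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)   # sample rows with replacement
        models.append(pls1_fit(X[idx], y[idx], n_components))
    return models

def noise_augmented_pls_fit(X, y, n_components, noise_fn, n_models=25, rng=None):
    """Noise ensemble (assumed scheme): each member sees X plus simulated noise.

    noise_fn(shape, rng) is a hypothetical callback returning a noise array.
    """
    rng = np.random.default_rng(rng)
    return [pls1_fit(X + noise_fn(X.shape, rng), y, n_components)
            for _ in range(n_models)]

def ensemble_predict(models, X):
    """Average the predictions of all ensemble members."""
    return np.mean([pls1_predict(m, X) for m in models], axis=0)
```

To target a specific disturbance such as temperature variation, `noise_fn` would need to generate perturbations shaped like that disturbance; white noise is only the simplest choice.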
Similar resources
Comparison of Ensemble Strategies in Online NIR for Monitoring the Extraction Process of Pericarpium Citri Reticulatae Based on Different Variable Selections.
Different ensemble strategies were compared in online near-infrared models for monitoring active pharmaceutical ingredients of Traditional Chinese Medicine. Bagging partial least square regression and boosting partial least square regression were adopted to near-infrared models, to determine hesperidin and nobiletin content during the extraction process of Pericarpium Citri Reticulatae in a pil...
An Online NIR Sensor for the Pilot-Scale Extraction Process in Fructus Aurantii Coupled with Single and Ensemble Methods
Model performance of the partial least squares method (PLS) alone and bagging-PLS was investigated in online near-infrared (NIR) sensor monitoring of pilot-scale extraction process in Fructus aurantii. High-performance liquid chromatography (HPLC) was used as a reference method to identify the active pharmaceutical ingredients: naringin, hesperidin and neohesperidin. Several preprocessing metho...
Partial Least Squares Random Forest Ensemble Regression as a Soft Sensor
Six simple, dynamic soft sensor methodologies with two update conditions were compared on two experimentally-obtained datasets and one simulated dataset. The soft sensors investigated were: moving window partial least squares regression (and a recursive variant), moving window random forest regression, feedforward neural networks, mean moving window, and a novel random forest partial least squa...
Determination of Protein and Moisture in Fishmeal by Near-Infrared Reflectance Spectroscopy and Multivariate Regression Based on Partial Least Squares
The potential of Near Infrared Reflectance Spectroscopy (NIRS) as a fast method to predict the Crude Protein (CP) and Moisture (M) content in fishmeal by scanning spectra between 1000 and 2500 nm using multivariate regression technique based on Partial Least Squares (PLS) was evaluated. The coefficient of determination in calibration (R2C) and Standard Error of Calibra...
Ensemble learning with trees and rules: Supervised, semi-supervised, unsupervised
In this article, we propose several new approaches for post processing a large ensemble of conjunctive rules for supervised, semi-supervised and unsupervised learning problems. We show with various examples that for high dimensional regression problems the models constructed by post processing the rules with partial least squares regression have significantly better prediction performance than ...